2023-05-01

Biology 395, Advanced Bioassessment

Everything will be in R for your reference! I will have this presentation, as well as a PDF version you’ll be able to take with you.

Biology 395

Welcome to Biology 395, Advanced Bioassessment. In this course we’ll:

  • Review the basis of Bioassessment (Think conservation biology and genetics)
  • Review runoff sources
  • Review water quality monitoring, its uses, and how to gather these data.
  • Review Biological Monitoring methods.
  • Review the basis of Indices of Biological Integrity
  • Gather and analyze VSCI data from a local Appomattox tributary.
  • eDNA now and in the future.

How will this class be formatted?

Mainly, you’ll review my lectures outside of class in Canvas. Then you’ll come to class and we’ll practice what those lectures covered. There will be two exams, one lab report, and lots of small R-based assignments; together these make up your grade.

General outcomes:

How will this class be formatted? (Continued)

Specific outcomes:

  • Be able to import and manipulate data in R
  • Assess a stream for physical habitat
  • Assess a stream for water quality
  • Perform Biological Monitoring
  • Analyze these data in R
  • Discuss the use case(s) of eDNA

Course Theory

This course is modeled after a professional biologist’s year. It includes a field season toward the beginning of the semester, and we will analyze the data once the weather is no longer conducive to collecting them.

August 21

Preview of Course

library(tidyverse)
## ── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
## ✔ dplyr     1.1.2     ✔ readr     2.1.4
## ✔ forcats   1.0.0     ✔ stringr   1.5.0
## ✔ ggplot2   3.4.2     ✔ tibble    3.2.1
## ✔ lubridate 1.9.2     ✔ tidyr     1.3.0
## ✔ purrr     1.0.1     
## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## ✖ dplyr::filter() masks stats::filter()
## ✖ dplyr::lag()    masks stats::lag()
## ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors
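library(ggpubr)      # publication-ready helpers built on ggplot2
library(FactoMineR)  # multivariate analysis (e.g., PCA)
library(factoextra)  # extract and visualize multivariate results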
## Welcome! Want to learn more? See two factoextra-related books at https://goo.gl/ve3WBa

August 21

Preview of Course

You’ll see “chunks” like these throughout the course. This is actual R code, and it is meant to be a guide/reference that you can take with you and use as the basis of your own code in the future. These HTML slides are what you’ll see in class, but I will give you a PDF document of all the code at the end of the semester. I can also give you particular lectures in PDF format upon request.

August 21

Preview of Course

Today, we’re going to take a look at a brief overview of what’s in the course starting with conservation biology/genetics.

knitr::include_graphics("Conservationgenetics_model.jpeg")

August 21

Preview of Course

We’ll look into the details of runoff sources and how they’re monitored, mainly in Virginia via the USGS and DEQ.

knitr::include_graphics("Runoff_sources.png")

August 21

Preview of Course

We’ll also talk in depth about biological monitoring. It is at the center of the course because it relies entirely on our ability to generate indices of biological integrity.

August 21

Preview of Course

Speaking of Indices of Biological Integrity (IBI), we’ll talk in depth about how these are generated, and how they’re applied. Indices of biological integrity have been around now for decades. These include but are not limited to:

The links above are really just references for you. We’ll talk in more detail about these later.

August 21

Preview of Course

As indices matured, the EPA requested multimetric indices of biological integrity so that states could have their own metrics and define which streams are impaired under Section 303(d) of the Clean Water Act.

  • Virginia developed its own IBI, known as the Virginia Stream Condition Index (VSCI), which we’ll discuss in depth.

  • A need was also identified for a multimetric index for streams that are not of a “higher gradient.” Research to develop this index through the mid-2000s produced the Coastal Plain Macroinvertebrate Index (CPMI).

We will talk about both of these in detail and learn how to analyze basic VSCI and CPMI datasets in R.

August 21

Preview of Course eDNA

eDNA, or environmental DNA, is a tool used in lieu of physical biological assessment. It can be used to detect cryptic, invasive, or endangered species without harming them with any sort of capture method. The method isolates DNA from a number of “matrices,” namely:

  • water
  • soil
  • bone/excrement

knitr::include_graphics("edna_image.jpg")

August 21

Preview of Course eDNA cont’d

August 23

Phab - Physical habitat assessment

August 23

Physical habitat assessment

One of the first steps when working on a field site for any biomonitoring effort is to assess physical habitat. The EPA developed rapid bioassessment tools that include physical habitat monitoring.

In normal site visits, you would look at every aspect of biomonitoring (Phab, fish, macroinvertebrates, water quality) at the same time, but since we’re learning, we’re mostly going to take them one at a time, starting with physical habitat.

August 23

Physical habitat assessment

Subjectivity is a real issue here, as each scientist could assess physical habitat differently. As a result, the EPA created a standardized protocol in an effort to decrease variability.

knitr::include_graphics("EPA_physicalhabitatprotocol.png")

August 23

Physical habitat assessment

As a result, our aims in this presentation will be threefold:

  • Review physical habitat assessment methods
  • Go into the field and perform a physical habitat survey on our field site for the semester
  • Analyze physical habitat results from empirical state data.

This lecture is meant to review Physical habitat methods, and next week, we’ll go into the field. When we introduce R, I’ll show you how to import data, and then we’ll do a brief “analysis.”

August 23

Physical habitat assessment

There are 10 physical characteristics of streams we will look at. Each slide will represent one of these characteristics, and we’ll briefly discuss each before going into the field.

  • Epifaunal Substrate and Available Cover

knitr::include_graphics("epifaunalsubstrate:availablecover.png")

August 23

Physical habitat assessment

  • Embeddedness
knitr::include_graphics("embeddedness.png")

August 23

Physical habitat assessment

  • Pool Substrate Characterization
knitr::include_graphics("PoolSubstarte.png")

August 23

Physical habitat assessment

  • Velocity Depth Combinations
knitr::include_graphics("VelocityDepth.png")

August 23

Physical habitat assessment

  • Pool Variability
knitr::include_graphics("Poolvariability.png")

August 23

Physical habitat assessment

  • Sediment Deposition
knitr::include_graphics("Sedimentdeposition.png")

August 23

Physical habitat assessment

  • Channel Flow Status
knitr::include_graphics("ChannelFlowStatus.png")

August 23

Physical habitat assessment

  • Channel Alteration
knitr::include_graphics("ChannelAlteration.png")

August 23

Physical habitat assessment

  • Frequency of Riffles
knitr::include_graphics("Frequencyriffles_bends.png")

August 23

Physical habitat assessment

  • Channel Sinuosity
knitr::include_graphics("ChannelSinusosity.png")

August 23

Physical habitat assessment

  • Bank Stability
knitr::include_graphics("BankStability.png")

August 23

Physical habitat assessment

  • Bank Vegetative Protection
knitr::include_graphics("BankVegetativeProtection.png")

August 23

Physical habitat assessment

  • Riparian Vegetative Zone Width
knitr::include_graphics("RiparianBufferZone.png")

August 23

Physical habitat assessment

The next step will be actually acquiring these data at our field site (Buffalo Creek - check). We’ll gather the data for the 10 characteristics we discussed and talk about what we think is good and/or bad for QA/QC purposes.

After that, during our R introduction, I will have you perform a small analysis of physical habitat data from Virginia DEQ so we can look at some basic statistics and see whether anything interesting stands out in the data.

August 30

Introduction to R

  • R is both a language and an interface for statistical analysis, programming, and graphics. R has become a standard interface for statistical analysis in the biological sciences due in part to its openness, its ability to be extended by users, and its vibrant user base. As a statistical analysis platform, R has its own grammar, and in this activity you will begin to understand how to use and interpret R.

To get R and Rstudio (you’ll need both), you can go here

Please have this by next class (Aug 30). This will be one of your assignments.

August 30

Introduction to R

  • R itself consists of an underlying engine that takes commands and provides feedback on these commands. Each command you give the R engine is either an expression or an assignment.

  • Expression: An expression is a statement that you give the R engine. R will evaluate the expression, give you the answer, and not keep any reference to it for future use. Some examples include:

2 + 6
## [1] 8
sqrt(5)
## [1] 2.236068
3 * (pi/2) - 1
## [1] 3.712389

August 30

Introduction to R

  • Assignment

An assignment causes R to evaluate the expression and store the result in a variable. This is important because you can use the variable in the future. An example of an assignment is:

x <- 2 + 6
myCoolVariable <- sqrt(5)
another_one_number23 <- 3 * (pi/2) - 1
x
## [1] 8
myCoolVariable
## [1] 2.236068
another_one_number23
## [1] 3.712389
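
Because an assignment keeps its result, later commands can build on it. Here is a minimal sketch of that idea; the names radius and area are made up for illustration:

```r
# Store a value, then reuse it in a later calculation
radius <- 2
area <- pi * radius^2   # uses the stored value of radius
area

# Assigning to the same name again replaces the old value
radius <- 3
radius
```

Note that area is not recalculated when radius changes; you would need to run the area line again.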

August 30

Introduction to R

  • Functions

There are thousands of potential functions in R and its associated packages. To use these functions, you need to understand the basic taxonomy of a function. A function has two parts:

  • A unique name, and
  • The stuff (e.g., variables) passed to it within the parentheses.
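
As a quick illustration of those two parts, here is one call to the built-in round() function. The name is round, and everything inside the parentheses is what gets passed to it. The variable name rounded is just a placeholder.

```r
# name: round
# passed in: the number to round, and how many digits to keep
rounded <- round(3.14159, digits = 2)
rounded
```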

August 30

Introduction to R

Not all functions need any additional variables. For example, the function ls() shows which variables R currently has in memory and does not require any parameters. If you forget to put the parentheses on the function and only use its name, by default R will show you the code that is inside the function (unless it is a compiled function). This is because each function is also a variable. This is why you should not use function names for your variable names (see below for more on naming).
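
To see this for yourself, you can try something like the sketch below. The names site_count and double_it are made up for the example; ls() is the only built-in used.

```r
# Create a variable and a small function of our own
site_count <- 12
double_it <- function(v) v * 2

ls()          # lists the names R currently has in memory
double_it(4)  # with parentheses: the function runs
double_it     # without parentheses: R prints the function's code
```

The last line is what happens when you forget the parentheses: no error, just the code of the function printed back at you.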

August 30

Introduction to R

To find the definition of a function, the arguments passed to it, details of the implementation, and some examples, you can use the ? shortcut. To find the definition for the sqrt() function type ?sqrt and R will provide you the documentation for that function.
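
The help calls are shown as comments below because they open a help page and are best run interactively. As a rough complement, args() and formals() let you peek at a function's arguments straight from the console; arg_names is just a placeholder name.

```r
# In an interactive session, either of these opens the help page:
#   ?sqrt
#   help("sqrt")

# args() prints a function's arguments and their defaults
args(log)

# The argument names can also be pulled out as a character vector
arg_names <- names(formals(args(log)))
arg_names
```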

August 30

Introduction to R

Functions may have more than one parameter passed to them. Often, if there are a lot of parameters, some will have default values. For example, the log() function provides logarithms. The definition of the log function shows log(x, base=exp(1)) (say from ?log). Playing around with the function shows:

log(2)
## [1] 0.6931472
log(2, base = 2)
## [1] 1
log(2, base = 10)
## [1] 0.30103
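
The calls above mix two ways of passing arguments: by position and by name. The sketch below spells that out with log(); the variable names are made up for the example.

```r
by_position <- log(8, 2)             # x = 8, base = 2, matched by position
by_name     <- log(x = 8, base = 2)  # same call with the names spelled out
reordered   <- log(base = 2, x = 8)  # named arguments can come in any order
default     <- log(8)                # base left out, so the default exp(1) is used

c(by_position, by_name, reordered, default)
```

Naming your arguments is a little more typing, but it makes code easier to read when you come back to it later.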

August 30

Introduction to R

R recognizes over a dozen different types of data. All of the data types are characterized by what R calls classes. To determine the type of any variable you can use the built-in function class(x). This will tell you what kind of variable x is. What follows are some of the more common data types.
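
Here is a short, non-exhaustive sketch of class() applied to a few values, just to show what the answers look like:

```r
class(3.7)      # "numeric"
class(2L)       # "integer" (the L suffix asks for an integer)
class("brook")  # "character"
class(TRUE)     # "logical"
class(sqrt)     # "function"
```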

August 30

Introduction to R

  • Numeric

Numeric types represent the majority of numerical valued items you will deal with. When you assign a number to a variable in R, it will most likely be a numeric type. Numeric data can be displayed with or without decimal places, depending on whether the value(s) include a decimal portion. In fact, R will make any assignment of a numerical value a numeric by default. For example:

x <- 4
class(x)
## [1] "numeric"
x
## [1] 4
x <- numeric(4)
x
## [1] 0 0 0 0

x[1] <- 2.4
x
## [1] 2.4 0.0 0.0 0.0

August 30

Introduction to R

Notice this is an all-or-nothing deal: each element of a vector must be the same type, and the default value for a numeric data type is zero. Also notice (especially those who have some experience programming in other languages) that indices in vectors (and matrices) start at 1 rather than 0. Operations on numeric types proceed as you would expect, and since numeric is the default type, you don’t really have to go around using the as.numeric(x) function. For example:

is.numeric(2.4)
## [1] TRUE
as.numeric(2) + 0.4
## [1] 2.4
2 + 0.4
## [1] 2.4
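
Two of the points above, indexing from 1 and every element sharing one type, are easy to check for yourself. This is a small sketch with made-up values; scores and mixed are placeholder names.

```r
scores <- c(12, 18, 7)
scores[1]   # first element; indexing starts at 1
scores[3]   # third (last) element

# Mixing numbers and text forces every element to character
mixed <- c(12, "eighteen", 7)
class(mixed)
mixed
```

R does not warn you about that conversion, so it is worth checking class() when a calculation on a vector unexpectedly fails.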

August 30

Introduction to R

  • Numeric

Word of Caution: You need to be careful when dealing with floating point numbers, partly because of the way computers store these numbers and present them to us in the R interface, and partly because of what happens when we perform logical operations on them. Consider the following case. The ancient Egyptians approximated π as the ratio 256/81.

e.pi <- 256/81
e.pi
## [1] 3.160494

August 30

Introduction to R

  • Numeric

Word of Caution cont’d: Very nice, and apparently close enough to 3.1416 that they could get work done. Now, as we all know, the value of π is the ratio of a circle’s circumference to its diameter. We also know that it is a transcendental number (i.e., one that cannot be produced using finite algebraic operations) and its decimal digits never settle into a repeating pattern.

print(e.pi, digits = 20)
## [1] 3.1604938271604936517

August 30

Introduction to R

  • Numeric

Word of Caution cont’d: There is another issue that you need to be careful with. You need to be considerate of how a computer stores numerical values. Consider the following:

x <- 0.3/3
x
## [1] 0.1
print(x, digits = 20)
## [1] 0.099999999999999991673

August 30

Introduction to R

  • Numeric

Why the difference? A computer deals in binary (0/1) representations and as such has a limited ability for precision, particularly for very large or very small numbers. Usually this does not cause much of a problem, but when you begin to work at crafting analyses, you should be aware of this drawback.
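
Where this tends to bite is in logical comparisons. The sketch below reuses the 0.3/3 example and shows one common workaround, all.equal(), which compares within a small tolerance. Treat it as one reasonable option, not the only one.

```r
x <- 0.3 / 3

x == 0.1                    # FALSE: the stored values differ in the last bits
isTRUE(all.equal(x, 0.1))   # TRUE: equal within a small tolerance
abs(x - 0.1) < 1e-9         # TRUE: an explicit tolerance works too
```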

August 30

Introduction to R

  • Character

The character data type is the one that handles letters and letter-like representations of numbers. For example, observe the following:

x <- "If you can read this, you are beginning to take a step into a larger world."
class(x)
## [1] "character"
length(x)
## [1] 1

August 30

Introduction to R

  • Character

Notice here how the variable x has a length of one, even though there are 75 characters within that string. If you want to know the number of characters, you need to use the nchar() function; length() will only tell you the ’vector length’ (see below) of the variable.
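
To make the difference between length() and nchar() concrete, here is a short sketch. The words vector is made up for the example.

```r
x <- "If you can read this, you are beginning to take a step into a larger world."
length(x)   # 1: one element in the vector
nchar(x)    # 75: characters in that element

words <- c("riffle", "pool", "run")
length(words)   # 3: three elements
nchar(words)    # 6 4 3: one count per element
```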

y <- 23
class(y)
## [1] "numeric"
z <- as.character(y)
z
## [1] "23"
class(z)
## [1] "character"

August 30

Introduction to R

  • Character

Notice how the variable y was initially designated as a numeric type, but if we use the as.character(y) function, we can coerce it into a non-numeric representation of the number. Character variables can be combined using the paste() function to ’paste together’ a string of characters (n.b., the optional sep argument controls what goes between the pieces).

w <- "cannot"
x <- "I"
y <- "can"
z <- "code in R"

paste(x, w, z)
## [1] "I cannot code in R"
paste(x, y, z)
## [1] "I can code in R"
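
The optional sep argument controls what paste() puts between the pieces. A small sketch follows; the "site" labels are made up for illustration.

```r
paste("I", "can", "code in R")              # default separator is a space
paste("I", "can", "code in R", sep = "_")   # custom separator
paste0("site", 1:3)                         # paste0() uses no separator at all
```

The last form is handy for building labels or file names from a set of numbers.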

August 30

Introduction to R

  • Constants